Reward bonuses with gain scheduling inspired by iterative deepening search
Authors
Abstract
This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate reinforcement learning search. While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory. This paper, therefore, first designs two bonuses, one for each of them. Then, a heuristic gain scheduling is applied to the designed bonuses, inspired by iterative deepening search, which is known to inherit the advantages of the two search algorithms. The proposed method is expected to allow the agent to efficiently reach the best solution in deeper states by gradually exploring unknown states. In three locomotion tasks with dense rewards and simple tasks with sparse rewards, it is shown that the two types of bonuses contribute to the performance improvement of different tasks complementarily. In addition, by combining them with the proposed gain scheduling, all tasks can be accomplished with high performance.
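As a rough illustration of the idea in the abstract, the sketch below adds two bonuses to a task reward and blends them with a scheduled gain. Everything in it is an assumption made for illustration: the abstract gives no formulas, so the count-based `breadth_bonus`, the saturating `depth_bonus`, the sawtooth `scheduled_gains`, and the `period` parameter are hypothetical stand-ins, not the paper's actual design.

```python
import math
from collections import Counter


def scheduled_gains(step, period=1000):
    """Hypothetical sawtooth schedule (an assumption, not the paper's rule).

    Each period starts breadth-heavy and ends depth-heavy, loosely
    mimicking the restarts of iterative deepening search.
    """
    phase = (step % period) / period  # in [0, 1)
    return 1.0 - phase, phase         # (breadth_gain, depth_gain)


class BonusReward:
    """Adds two illustrative intrinsic bonuses to a task reward."""

    def __init__(self, period=1000):
        self.period = period
        self.visits = Counter()  # visit count per (hashable) state key

    def __call__(self, task_reward, state_key, depth, step):
        self.visits[state_key] += 1
        # Breadth-like bonus: larger for rarely visited states.
        breadth_bonus = 1.0 / math.sqrt(self.visits[state_key])
        # Depth-like bonus: grows with progress into the episode, saturating at 1.
        depth_bonus = math.tanh(depth / 100.0)
        g_b, g_d = scheduled_gains(step, self.period)
        return task_reward + g_b * breadth_bonus + g_d * depth_bonus
```

A training loop would call `BonusReward` once per transition, passing the environment reward, a hashable state key, the step index within the episode as `depth`, and the global step counter as `step`.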
Similar resources
Enhanced Iterative-Deepening Search
Iterative-deepening searches mimic a breadth-first node expansion with a series of depth-first searches that operate with successively extended search horizons. They have been proposed as a simple way to reduce the space complexity of best-first searches like A* from exponential to linear in the search depth. But there is more to iterative-deepening than just a reduction of storage space. As we show, th...
Adaptive Parallel Iterative Deepening Search
Many of the artificial intelligence techniques developed to date rely on heuristic search through large spaces. Unfortunately, the size of these spaces and the corresponding computational effort reduce the applicability of otherwise novel and effective algorithms. A number of parallel and distributed approaches to search have considerably improved the performance of the search process. Our goal is...
Dynamic Step Size Adjustment in Iterative Deepening Search
If an iterative deepening search (IDS) procedure has the property that solutions at a given iteration are also found at later iterations, it is possible to skip iterations without loss of correctness. We examine the conditions required for skipping to be worthwhile and give an algorithm for dynamically adapting the skipping to the behaviour of the search procedure. We consider the problem f wit...
Depth-First Iterative-Deepening: An Optimal Admissible Tree Search
The complexities of various search algorithms are considered in terms of time, space, and cost of solution path. It is known that breadth-first search requires too much space and depth-first search can use too much time and doesn't always find a cheapest path. A depth-first iterative-deepening algorithm is shown to be asymptotically optimal along all three dimensions for exponential tree searche...
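The trade-off these abstracts describe can be seen in a minimal sketch of depth-first iterative deepening on a small graph. This is a generic textbook-style version, not code from any of the listed papers; the adjacency-dict representation and the names `depth_limited_dfs`, `iddfs`, and `max_depth` are assumptions chosen for illustration.

```python
def depth_limited_dfs(graph, node, goal, limit, path):
    """Depth-first search that gives up below `limit` edges from `node`."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt in path:  # avoid cycles along the current path
            continue
        found = depth_limited_dfs(graph, nxt, goal, limit - 1, path + [nxt])
        if found is not None:
            return found
    return None


def iddfs(graph, start, goal, max_depth):
    """Repeat depth-limited DFS with limits 0, 1, ..., max_depth.

    Memory stays proportional to the current depth limit, as in DFS, while
    the first path found uses the fewest edges, as in breadth-first search.
    """
    for limit in range(max_depth + 1):
        found = depth_limited_dfs(graph, start, goal, limit, [start])
        if found is not None:
            return found
    return None
```

On the graph `{"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"]}`, `iddfs(graph, "A", "G", 3)` returns the two-edge path through `C`, whereas a plain depth-first search that tries `B` first would return the longer path through `B` and `D`.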
Journal
Journal title: Results in Control and Optimization
Year: 2023
ISSN: 2666-7207
DOI: https://doi.org/10.1016/j.rico.2023.100244